Overview

Dataset statistics

Number of variables: 12
Number of observations: 557
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 115.6 KiB
Average record size in memory: 212.6 B

Variable types

Numeric: 10
Categorical: 2

Warnings

df_index has unique values (Unique)

Reproduction

Analysis started: 2021-04-05 08:50:20.201323
Analysis finished: 2021-04-05 08:50:30.518632
Duration: 10.32 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 557
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 299.6229803
Minimum: 0
Maximum: 582
Zeros: 1
Zeros (%): 0.2%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 31.8
Q1: 159
median: 304
Q3: 443
95-th percentile: 554.2
Maximum: 582
Range: 582
Interquartile range (IQR): 284

Descriptive statistics

Standard deviation: 166.9263764
Coefficient of variation (CV): 0.5571214074
Kurtosis: -1.163090008
Mean: 299.6229803
Median Absolute Deviation (MAD): 142
Skewness: -0.07214802646
Sum: 166890
Variance: 27864.41515
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | 0.2%
400 | 1 | 0.2%
394 | 1 | 0.2%
395 | 1 | 0.2%
396 | 1 | 0.2%
397 | 1 | 0.2%
398 | 1 | 0.2%
399 | 1 | 0.2%
401 | 1 | 0.2%
409 | 1 | 0.2%
Other values (547) | 547 | 98.2%

Value | Count | Frequency (%)
0 | 1 | 0.2%
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%

Value | Count | Frequency (%)
582 | 1 | 0.2%
581 | 1 | 0.2%
580 | 1 | 0.2%
579 | 1 | 0.2%
578 | 1 | 0.2%
577 | 1 | 0.2%
576 | 1 | 0.2%
575 | 1 | 0.2%
574 | 1 | 0.2%
573 | 1 | 0.2%

Age
Real number (ℝ≥0)

Distinct: 72
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.95691203
Minimum: 4
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 4
5-th percentile: 18
Q1: 33
median: 45
Q3: 58
95-th percentile: 72
Maximum: 90
Range: 86
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 16.2961004
Coefficient of variation (CV): 0.3624826454
Kurtosis: -0.5710065581
Mean: 44.95691203
Median Absolute Deviation (MAD): 13
Skewness: -0.06452886765
Sum: 25041
Variance: 265.5628883
Monotonicity: Not monotonic
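
Several of the Age figures follow from one another by simple arithmetic. The sketch below is a sanity check rather than the profiler's own code: it recomputes the mean, coefficient of variation, variance, IQR and range from the values reported for Age. The variable names are illustrative.

```python
# Values copied from the Age summary in this report.
n = 557            # number of observations
total = 25041      # Sum
std = 16.2961004   # Standard deviation
q1, q3 = 33, 58
minimum, maximum = 4, 90

mean = total / n                 # Sum divided by the number of observations
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"mean={mean:.4f} cv={cv:.4f} variance={variance:.2f} "
      f"iqr={iqr} range={value_range}")
```

The recomputed values agree with the reported Mean, CV, Variance, IQR and Range up to rounding.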
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
60 | 34 | 6.1%
45 | 25 | 4.5%
50 | 23 | 4.1%
32 | 20 | 3.6%
48 | 20 | 3.6%
38 | 19 | 3.4%
42 | 19 | 3.4%
55 | 18 | 3.2%
65 | 17 | 3.1%
46 | 16 | 2.9%
Other values (62) | 346 | 62.1%

Value | Count | Frequency (%)
4 | 2 | 0.4%
6 | 1 | 0.2%
7 | 2 | 0.4%
8 | 1 | 0.2%
10 | 1 | 0.2%
11 | 1 | 0.2%
12 | 2 | 0.4%
13 | 4 | 0.7%
14 | 2 | 0.4%
15 | 1 | 0.2%

Value | Count | Frequency (%)
90 | 1 | 0.2%
85 | 1 | 0.2%
84 | 1 | 0.2%
78 | 1 | 0.2%
75 | 14 | 2.5%
74 | 4 | 0.7%
73 | 2 | 0.4%
72 | 6 | 1.1%
70 | 9 | 1.6%
69 | 2 | 0.4%

Gender
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 33.6 KiB
Male: 419
Female: 138

Length

Max length: 6
Median length: 4
Mean length: 4.49551167
Min length: 4

Characters and Unicode

Total characters: 2504
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Male
3rd row: Male
4th row: Male
5th row: Male

Value | Count | Frequency (%)
Male | 419 | 75.2%
Female | 138 | 24.8%

Histogram of lengths of the category

Value | Count | Frequency (%)
male | 419 | 75.2%
female | 138 | 24.8%

Most occurring characters

Value | Count | Frequency (%)
e | 695 | 27.8%
a | 557 | 22.2%
l | 557 | 22.2%
M | 419 | 16.7%
F | 138 | 5.5%
m | 138 | 5.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1947 | 77.8%
Uppercase Letter | 557 | 22.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 695 | 35.7%
a | 557 | 28.6%
l | 557 | 28.6%
m | 138 | 7.1%

Uppercase Letter
Value | Count | Frequency (%)
M | 419 | 75.2%
F | 138 | 24.8%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2504 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
e | 695 | 27.8%
a | 557 | 22.2%
l | 557 | 22.2%
M | 419 | 16.7%
F | 138 | 5.5%
m | 138 | 5.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2504 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
e | 695 | 27.8%
a | 557 | 22.2%
l | 557 | 22.2%
M | 419 | 16.7%
F | 138 | 5.5%
m | 138 | 5.5%

Total_Bilirubin
Real number (ℝ≥0)

Distinct: 111
Distinct (%): 19.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.345780969
Minimum: 0.4
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0.4
5-th percentile: 0.6
Q1: 0.8
median: 1
Q3: 2.6
95-th percentile: 16.62
Maximum: 75
Range: 74.6
Interquartile range (IQR): 1.8

Descriptive statistics

Standard deviation: 6.328424882
Coefficient of variation (CV): 1.891464187
Kurtosis: 35.84007349
Mean: 3.345780969
Median Absolute Deviation (MAD): 0.3
Skewness: 4.830356055
Sum: 1863.6
Variance: 40.04896148
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.8 | 87 | 15.6%
0.7 | 75 | 13.5%
0.9 | 55 | 9.9%
0.6 | 42 | 7.5%
1 | 26 | 4.7%
1.1 | 19 | 3.4%
1.8 | 14 | 2.5%
1.4 | 13 | 2.3%
1.3 | 12 | 2.2%
1.7 | 11 | 2.0%
Other values (101) | 203 | 36.4%

Value | Count | Frequency (%)
0.4 | 1 | 0.2%
0.5 | 5 | 0.9%
0.6 | 42 | 7.5%
0.7 | 75 | 13.5%
0.8 | 87 | 15.6%
0.9 | 55 | 9.9%
1 | 26 | 4.7%
1.1 | 19 | 3.4%
1.2 | 8 | 1.4%
1.3 | 12 | 2.2%

Value | Count | Frequency (%)
75 | 1 | 0.2%
42.8 | 1 | 0.2%
32.6 | 1 | 0.2%
30.8 | 1 | 0.2%
30.5 | 2 | 0.4%
27.7 | 1 | 0.2%
27.2 | 1 | 0.2%
26.3 | 1 | 0.2%
25 | 1 | 0.2%
23.3 | 1 | 0.2%

Direct_Bilirubin
Real number (ℝ≥0)

Distinct: 79
Distinct (%): 14.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.50951526
Minimum: 0.1
Maximum: 19.7
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.1
Q1: 0.2
median: 0.3
Q3: 1.3
95-th percentile: 8.5
Maximum: 19.7
Range: 19.6
Interquartile range (IQR): 1.1

Descriptive statistics

Standard deviation: 2.858843269
Coefficient of variation (CV): 1.893881661
Kurtosis: 10.88827195
Mean: 1.50951526
Median Absolute Deviation (MAD): 0.2
Skewness: 3.162135724
Sum: 840.8
Variance: 8.172984837
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.2 | 190 | 34.1%
0.1 | 57 | 10.2%
0.3 | 49 | 8.8%
0.8 | 22 | 3.9%
0.4 | 19 | 3.4%
0.5 | 18 | 3.2%
0.6 | 16 | 2.9%
1 | 13 | 2.3%
1.3 | 12 | 2.2%
1.6 | 11 | 2.0%
Other values (69) | 150 | 26.9%

Value | Count | Frequency (%)
0.1 | 57 | 10.2%
0.2 | 190 | 34.1%
0.3 | 49 | 8.8%
0.4 | 19 | 3.4%
0.5 | 18 | 3.2%
0.6 | 16 | 2.9%
0.7 | 11 | 2.0%
0.8 | 22 | 3.9%
0.9 | 5 | 0.9%
1 | 13 | 2.3%

Value | Count | Frequency (%)
19.7 | 1 | 0.2%
18.3 | 1 | 0.2%
17.1 | 1 | 0.2%
14.2 | 1 | 0.2%
14.1 | 1 | 0.2%
13.7 | 1 | 0.2%
12.8 | 1 | 0.2%
12.6 | 2 | 0.4%
12.1 | 1 | 0.2%
11.8 | 2 | 0.4%

Alkaline_Phosphotase
Real number (ℝ≥0)

Distinct: 261
Distinct (%): 46.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 292.9802513
Minimum: 63
Maximum: 2110
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 63
5-th percentile: 135
Q1: 176
median: 208
Q3: 298
95-th percentile: 725.2
Maximum: 2110
Range: 2047
Interquartile range (IQR): 122

Descriptive statistics

Standard deviation: 247.7258681
Coefficient of variation (CV): 0.845537769
Kurtosis: 16.95388999
Mean: 292.9802513
Median Absolute Deviation (MAD): 49
Skewness: 3.69082235
Sum: 163190
Variance: 61368.10572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
215 | 11 | 2.0%
198 | 11 | 2.0%
298 | 11 | 2.0%
195 | 10 | 1.8%
190 | 10 | 1.8%
182 | 9 | 1.6%
165 | 8 | 1.4%
180 | 8 | 1.4%
188 | 7 | 1.3%
202 | 7 | 1.3%
Other values (251) | 465 | 83.5%

Value | Count | Frequency (%)
63 | 1 | 0.2%
75 | 1 | 0.2%
90 | 1 | 0.2%
92 | 2 | 0.4%
97 | 1 | 0.2%
98 | 1 | 0.2%
100 | 2 | 0.4%
102 | 1 | 0.2%
103 | 1 | 0.2%
105 | 1 | 0.2%

Value | Count | Frequency (%)
2110 | 1 | 0.2%
1896 | 1 | 0.2%
1750 | 1 | 0.2%
1630 | 1 | 0.2%
1620 | 1 | 0.2%
1580 | 1 | 0.2%
1550 | 1 | 0.2%
1420 | 1 | 0.2%
1350 | 2 | 0.4%
1124 | 1 | 0.2%

Alamine_Aminotransferase
Real number (ℝ≥0)

Distinct: 148
Distinct (%): 26.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.69658887
Minimum: 10
Maximum: 2000
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 10
5-th percentile: 14.8
Q1: 23
median: 35
Q3: 60
95-th percentile: 222
Maximum: 2000
Range: 1990
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 180.2557023
Coefficient of variation (CV): 2.29051481
Kurtosis: 55.05109083
Mean: 78.69658887
Median Absolute Deviation (MAD): 15
Skewness: 6.853131303
Sum: 43834
Variance: 32492.11821
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
25 | 25 | 4.5%
20 | 21 | 3.8%
22 | 18 | 3.2%
28 | 17 | 3.1%
18 | 17 | 3.1%
21 | 17 | 3.1%
30 | 15 | 2.7%
15 | 14 | 2.5%
24 | 13 | 2.3%
48 | 12 | 2.2%
Other values (138) | 388 | 69.7%

Value | Count | Frequency (%)
10 | 4 | 0.7%
11 | 2 | 0.4%
12 | 10 | 1.8%
13 | 4 | 0.7%
14 | 8 | 1.4%
15 | 14 | 2.5%
16 | 8 | 1.4%
17 | 8 | 1.4%
18 | 17 | 3.1%
19 | 6 | 1.1%

Value | Count | Frequency (%)
2000 | 1 | 0.2%
1680 | 1 | 0.2%
1630 | 1 | 0.2%
1350 | 1 | 0.2%
1250 | 2 | 0.4%
950 | 1 | 0.2%
790 | 1 | 0.2%
779 | 1 | 0.2%
622 | 1 | 0.2%
509 | 1 | 0.2%

Aspartate_Aminotransferase
Real number (ℝ≥0)

Distinct: 173
Distinct (%): 31.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108.8258528
Minimum: 10
Maximum: 4929
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 10
5-th percentile: 15
Q1: 25
median: 41
Q3: 86
95-th percentile: 400.2
Maximum: 4929
Range: 4919
Interquartile range (IQR): 61

Descriptive statistics

Standard deviation: 292.9194591
Coefficient of variation (CV): 2.691634861
Kurtosis: 149.6502062
Mean: 108.8258528
Median Absolute Deviation (MAD): 20
Skewness: 10.57151891
Sum: 60616
Variance: 85801.80955
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
23 | 16 | 2.9%
20 | 14 | 2.5%
30 | 14 | 2.5%
21 | 14 | 2.5%
25 | 13 | 2.3%
28 | 13 | 2.3%
22 | 13 | 2.3%
32 | 12 | 2.2%
24 | 12 | 2.2%
29 | 11 | 2.0%
Other values (163) | 425 | 76.3%

Value | Count | Frequency (%)
10 | 1 | 0.2%
11 | 2 | 0.4%
12 | 5 | 0.9%
13 | 3 | 0.5%
14 | 8 | 1.4%
15 | 11 | 2.0%
16 | 9 | 1.6%
17 | 8 | 1.4%
18 | 9 | 1.6%
19 | 11 | 2.0%

Value | Count | Frequency (%)
4929 | 1 | 0.2%
2946 | 1 | 0.2%
1600 | 1 | 0.2%
1500 | 1 | 0.2%
1050 | 2 | 0.4%
960 | 1 | 0.2%
950 | 1 | 0.2%
850 | 4 | 0.7%
844 | 1 | 0.2%
794 | 1 | 0.2%

Total_Protiens
Real number (ℝ≥0)

Distinct: 58
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.51005386
Minimum: 2.7
Maximum: 9.6
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 2.7
5-th percentile: 4.6
Q1: 5.8
median: 6.6
Q3: 7.2
95-th percentile: 8.12
Maximum: 9.6
Range: 6.9
Interquartile range (IQR): 1.4

Descriptive statistics

Standard deviation: 1.091105059
Coefficient of variation (CV): 0.167603077
Kurtosis: 0.3020396956
Mean: 6.51005386
Median Absolute Deviation (MAD): 0.7
Skewness: -0.3374894565
Sum: 3626.1
Variance: 1.190510249
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7 | 32 | 5.7%
6.8 | 26 | 4.7%
6 | 26 | 4.7%
6.9 | 25 | 4.5%
6.2 | 24 | 4.3%
7.1 | 22 | 3.9%
8 | 20 | 3.6%
7.2 | 19 | 3.4%
7.3 | 18 | 3.2%
6.1 | 18 | 3.2%
Other values (48) | 327 | 58.7%

Value | Count | Frequency (%)
2.7 | 1 | 0.2%
2.8 | 1 | 0.2%
3 | 1 | 0.2%
3.6 | 3 | 0.5%
3.7 | 1 | 0.2%
3.8 | 2 | 0.4%
3.9 | 2 | 0.4%
4 | 2 | 0.4%
4.1 | 2 | 0.4%
4.3 | 3 | 0.5%

Value | Count | Frequency (%)
9.6 | 1 | 0.2%
9.5 | 1 | 0.2%
9.2 | 2 | 0.4%
8.9 | 1 | 0.2%
8.7 | 1 | 0.2%
8.6 | 3 | 0.5%
8.5 | 5 | 0.9%
8.4 | 3 | 0.5%
8.3 | 3 | 0.5%
8.2 | 8 | 1.4%

Albumin
Real number (ℝ≥0)

Distinct: 40
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.156373429
Minimum: 0.9
Maximum: 5.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0.9
5-th percentile: 1.8
Q1: 2.6
median: 3.1
Q3: 3.8
95-th percentile: 4.4
Maximum: 5.5
Range: 4.6
Interquartile range (IQR): 1.2

Descriptive statistics

Standard deviation: 0.7980978672
Coefficient of variation (CV): 0.2528528025
Kurtosis: -0.3566704456
Mean: 3.156373429
Median Absolute Deviation (MAD): 0.6
Skewness: -0.07879703319
Sum: 1758.1
Variance: 0.6369602056
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)
Value | Count | Frequency (%)
3 | 43 | 7.7%
4 | 37 | 6.6%
2.9 | 29 | 5.2%
3.1 | 26 | 4.7%
3.2 | 26 | 4.7%
3.9 | 25 | 4.5%
3.5 | 23 | 4.1%
2.5 | 22 | 3.9%
3.4 | 21 | 3.8%
3.3 | 21 | 3.8%
Other values (30) | 284 | 51.0%

Value | Count | Frequency (%)
0.9 | 2 | 0.4%
1 | 1 | 0.2%
1.4 | 3 | 0.5%
1.5 | 3 | 0.5%
1.6 | 8 | 1.4%
1.7 | 3 | 0.5%
1.8 | 12 | 2.2%
1.9 | 7 | 1.3%
2 | 17 | 3.1%
2.1 | 14 | 2.5%

Value | Count | Frequency (%)
5.5 | 2 | 0.4%
5 | 1 | 0.2%
4.9 | 4 | 0.7%
4.8 | 2 | 0.4%
4.7 | 3 | 0.5%
4.6 | 4 | 0.7%
4.5 | 6 | 1.1%
4.4 | 8 | 1.4%
4.3 | 12 | 2.2%
4.2 | 12 | 2.2%

Albumin_and_Globulin_Ratio
Real number (ℝ≥0)

Distinct: 70
Distinct (%): 12.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9489873418
Minimum: 0.3
Maximum: 2.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0.3
5-th percentile: 0.5
Q1: 0.7
median: 0.95
Q3: 1.1
95-th percentile: 1.5
Maximum: 2.8
Range: 2.5
Interquartile range (IQR): 0.4

Descriptive statistics

Standard deviation: 0.318525811
Coefficient of variation (CV): 0.3356481135
Kurtosis: 3.476751904
Mean: 0.9489873418
Median Absolute Deviation (MAD): 0.15
Skewness: 1.014359717
Sum: 528.5859494
Variance: 0.1014586923
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 102 | 18.3%
0.8 | 59 | 10.6%
0.9 | 55 | 9.9%
0.7 | 53 | 9.5%
1.1 | 44 | 7.9%
1.2 | 35 | 6.3%
0.6 | 31 | 5.6%
1.3 | 25 | 4.5%
0.5 | 23 | 4.1%
1.4 | 17 | 3.1%
Other values (60) | 113 | 20.3%

Value | Count | Frequency (%)
0.3 | 4 | 0.7%
0.35 | 1 | 0.2%
0.37 | 1 | 0.2%
0.39 | 1 | 0.2%
0.4 | 14 | 2.5%
0.45 | 1 | 0.2%
0.46 | 1 | 0.2%
0.47 | 2 | 0.4%
0.48 | 1 | 0.2%
0.5 | 23 | 4.1%

Value | Count | Frequency (%)
2.8 | 1 | 0.2%
2.5 | 2 | 0.4%
1.9 | 1 | 0.2%
1.85 | 2 | 0.4%
1.8 | 3 | 0.5%
1.72 | 1 | 0.2%
1.7 | 4 | 0.7%
1.66 | 1 | 0.2%
1.6 | 3 | 0.5%
1.58 | 2 | 0.4%

Liver_Disease
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 38.7 KiB
Liver Disease: 396
No Liver Disease: 161

Length

Max length: 16
Median length: 13
Mean length: 13.86714542
Min length: 13

Characters and Unicode

Total characters: 7724
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Liver Disease
2nd row: Liver Disease
3rd row: Liver Disease
4th row: Liver Disease
5th row: Liver Disease

Value | Count | Frequency (%)
Liver Disease | 396 | 71.1%
No Liver Disease | 161 | 28.9%

Histogram of lengths of the category

Value | Count | Frequency (%)
disease | 557 | 43.7%
liver | 557 | 43.7%
no | 161 | 12.6%

Most occurring characters

Value | Count | Frequency (%)
e | 1671 | 21.6%
i | 1114 | 14.4%
s | 1114 | 14.4%
(space) | 718 | 9.3%
L | 557 | 7.2%
v | 557 | 7.2%
r | 557 | 7.2%
D | 557 | 7.2%
a | 557 | 7.2%
N | 161 | 2.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5731 | 74.2%
Uppercase Letter | 1275 | 16.5%
Space Separator | 718 | 9.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 1671 | 29.2%
i | 1114 | 19.4%
s | 1114 | 19.4%
v | 557 | 9.7%
r | 557 | 9.7%
a | 557 | 9.7%
o | 161 | 2.8%

Uppercase Letter
Value | Count | Frequency (%)
L | 557 | 43.7%
D | 557 | 43.7%
N | 161 | 12.6%

Space Separator
Value | Count | Frequency (%)
(space) | 718 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7006 | 90.7%
Common | 718 | 9.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 1671 | 23.9%
i | 1114 | 15.9%
s | 1114 | 15.9%
L | 557 | 8.0%
v | 557 | 8.0%
r | 557 | 8.0%
D | 557 | 8.0%
a | 557 | 8.0%
N | 161 | 2.3%
o | 161 | 2.3%

Common
Value | Count | Frequency (%)
(space) | 718 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7724 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
e | 1671 | 21.6%
i | 1114 | 14.4%
s | 1114 | 14.4%
(space) | 718 | 9.3%
L | 557 | 7.2%
v | 557 | 7.2%
r | 557 | 7.2%
D | 557 | 7.2%
a | 557 | 7.2%
N | 161 | 2.1%

Interactions

[Figures: 90 interaction plots, not reproduced here.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
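
The calculation just described can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations; the 1/(n-1) factors
    # of covariance and variance cancel, so they are omitted.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant input gives zero variance and raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)

# Illustrative data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_shifted_scaled = [10.0 + 3.0 * v for v in x]  # change of location and scale
y_reversed = [-v for v in x]

print(pearson_r(x, y_shifted_scaled))
print(pearson_r(x, y_reversed))
```

Shifting and positively rescaling one variable leaves r at +1 here, while negating it flips the sign to -1.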

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
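
As a hedged sketch of that recipe: rank each variable, then apply the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are illustrative assumptions, and ties are handled by average ranks, which is a common convention rather than something this report states.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: y is a monotonic but nonlinear function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(spearman_rho(x, y), pearson_r(x, y))
```

On this data ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.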

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
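
The pair-counting definition can be written out directly. The sketch below implements that literal formula (often called τ-a), counting tied pairs as neither concordant nor discordant. Many libraries report the tie-adjusted τ-b variant instead, so results may differ when the data contain ties. The function name and sample data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(xs, ys)), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Illustrative data, not taken from the report.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau_a(x, y))
```

Here five of the six pairs are concordant and one is discordant, giving τ = (5 - 1) / 6.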

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
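
A minimal sketch of a bias-corrected Cramér's V in the form attributed to Bergsma (2013), computed from a contingency table of counts. This is an illustration under stated assumptions, not necessarily the exact computation behind this report: it applies no continuity correction, assumes every row and column total is positive, and uses made-up tables and an illustrative function name.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical counts, not taken from the report.
independent = [[20, 30], [40, 60]]   # rows are proportional: no association
perfect = [[50, 0], [0, 50]]         # each row maps to exactly one column

print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The proportional table yields 0 and the diagonal table yields 1, matching the two ends of the range described above.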

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

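(row) | df_index | Age | Gender | Total_Bilirubin | Direct_Bilirubin | Alkaline_Phosphotase | Alamine_Aminotransferase | Aspartate_Aminotransferase | Total_Protiens | Albumin | Albumin_and_Globulin_Ratio | Liver_Disease
0 | 0 | 65 | Female | 0.7 | 0.1 | 187 | 16 | 18 | 6.8 | 3.3 | 0.90 | Liver Disease
1 | 1 | 62 | Male | 10.9 | 5.5 | 699 | 64 | 100 | 7.5 | 3.2 | 0.74 | Liver Disease
2 | 2 | 62 | Male | 7.3 | 4.1 | 490 | 60 | 68 | 7.0 | 3.3 | 0.89 | Liver Disease
3 | 3 | 58 | Male | 1.0 | 0.4 | 182 | 14 | 20 | 6.8 | 3.4 | 1.00 | Liver Disease
4 | 4 | 72 | Male | 3.9 | 2.0 | 195 | 27 | 59 | 7.3 | 2.4 | 0.40 | Liver Disease
5 | 5 | 46 | Male | 1.8 | 0.7 | 208 | 19 | 14 | 7.6 | 4.4 | 1.30 | Liver Disease
6 | 6 | 26 | Female | 0.9 | 0.2 | 154 | 16 | 12 | 7.0 | 3.5 | 1.00 | Liver Disease
7 | 7 | 29 | Female | 0.9 | 0.3 | 202 | 14 | 11 | 6.7 | 3.6 | 1.10 | Liver Disease
8 | 8 | 17 | Male | 0.9 | 0.3 | 202 | 22 | 19 | 7.4 | 4.1 | 1.20 | No Liver Disease
9 | 9 | 55 | Male | 0.7 | 0.2 | 290 | 53 | 58 | 6.8 | 3.4 | 1.00 | Liver Disease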

Last rows

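(row) | df_index | Age | Gender | Total_Bilirubin | Direct_Bilirubin | Alkaline_Phosphotase | Alamine_Aminotransferase | Aspartate_Aminotransferase | Total_Protiens | Albumin | Albumin_and_Globulin_Ratio | Liver_Disease
547 | 573 | 32 | Male | 3.7 | 1.6 | 612 | 50 | 88 | 6.2 | 1.9 | 0.40 | Liver Disease
548 | 574 | 32 | Male | 12.1 | 6.0 | 515 | 48 | 92 | 6.6 | 2.4 | 0.50 | Liver Disease
549 | 575 | 32 | Male | 25.0 | 13.7 | 560 | 41 | 88 | 7.9 | 2.5 | 2.50 | Liver Disease
550 | 576 | 32 | Male | 15.0 | 8.2 | 289 | 58 | 80 | 5.3 | 2.2 | 0.70 | Liver Disease
551 | 577 | 32 | Male | 12.7 | 8.4 | 190 | 28 | 47 | 5.4 | 2.6 | 0.90 | Liver Disease
552 | 578 | 60 | Male | 0.5 | 0.1 | 500 | 20 | 34 | 5.9 | 1.6 | 0.37 | No Liver Disease
553 | 579 | 40 | Male | 0.6 | 0.1 | 98 | 35 | 31 | 6.0 | 3.2 | 1.10 | Liver Disease
554 | 580 | 52 | Male | 0.8 | 0.2 | 245 | 48 | 49 | 6.4 | 3.2 | 1.00 | Liver Disease
555 | 581 | 31 | Male | 1.3 | 0.5 | 184 | 29 | 32 | 6.8 | 3.4 | 1.00 | Liver Disease
556 | 582 | 38 | Male | 1.0 | 0.3 | 216 | 21 | 24 | 7.3 | 4.4 | 1.50 | No Liver Disease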